SYSTEM AND METHOD OF ANALYZING AN HTML DOCUMENT FOR CHANGES
SUCH THAT THE CHANGED AREAS CAN BE DISPLAYED WITH THE ORIGINAL
FORMATTING INTACT

Background of the Invention 

This invention relates generally to a system and method for comparing the differences 
between two documents and in particular to a system and method for comparing two hypertext 
markup language (HTML) documents and displaying the changed areas in the HTML documents 
while retaining the original HTML formatting.

The traditional method of locating document changes within pure text files is
accomplished via a technique known as file differencing, or diffing. UNIX has a utility called
"diff" that is used for file differencing. It works by comparing each line in a first file (the Right
File) with each line in a second file (the Left File). A carriage return character typically separates
each line from each other line. After the comparisons are finished, each line in the Right File
will be identified as having one of the following states:

1. unmodified - The current line exactly matches another line in the Left
File;

2. new - The current line has no match in the Left File; and

3. modified - The current line nearly matches one of the lines in the Left File
with some changes.
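
By way of illustration only, the three line states listed above can be sketched in Python. The sketch assumes the standard difflib module as a stand-in for the UNIX utility, and the function name classify_right_lines is hypothetical. Treating every "replace" run as "modified" is a simplifying assumption; a real differencing tool may apply a stricter test for a near match.

```python
from difflib import SequenceMatcher

def classify_right_lines(left_lines, right_lines):
    """Label each Right File line as unmodified, new, or modified (illustrative only)."""
    # Assumption: difflib opcodes approximate the three states described in the text.
    state_for_tag = {"equal": "unmodified", "insert": "new", "replace": "modified"}
    matcher = SequenceMatcher(None, left_lines, right_lines, autojunk=False)
    states = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            continue  # lines present only in the Left File have no Right File entry
        states.extend((line, state_for_tag[tag]) for line in right_lines[j1:j2])
    return states
```

Each returned pair holds a line of the Right File and the state assigned to it.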

The unit of comparison, a line, is deliberately chosen because it is an intermediate amount 
of information. In other words, it is somewhat larger than a single character or word, and 
therefore offers a meaningful context for the detected change. However, a line is still small 
enough so that the remainder of text, divided into lines, is considered separately and most lines
are often identical in both files.



To better understand this typical differencing technique, consider a "diff" operation, i.e.,
line-by-line comparison, of the text on the left and its revised version on the right where
modified lines have been underlined in the right file for emphasis:



Left (original text):

La Nina continues in the Pacific
Ocean, meaning cooler than average
sea surface temperatures along the
equator north of South America.
Typically this means a warmer and
drier summer for the Midwest. The
Summer of '99 has been very hot,
with 32 days recording highs of 90
degrees or above, and very dry, with
rainfall deficits exceeding 4.5 feet
so far.

For this example, Table 1 below illustrates what you would see if the basis of comparison was a
word (left column) vs. a line (right column).

Table 1 - Changes Found After Comparing Text on a Word Basis vs. Line Basis

Word Basis    Line Basis
----------    --------------------------------------
west          Equator west of South America
1999          Summer of 1999 has been very hot
inches        Rainfall deficits exceeding 4.5 inches



As illustrated by the above example, the detected changes in a line by line based
comparison (right column) are more useful for conveying the essence of the revisions than the
detected changes when using a smaller unit comparison, such as a word based comparison.



Right (revised text) for the example above:

La Nina continues in the Pacific
Ocean, meaning cooler than average
sea surface temperatures along the
equator west of South America.
Typically this means a warmer and
drier summer for the Midwest. The
Summer of 1999 has been very hot,
with 32 days recording highs of 90
degrees or above, and very dry, with
rainfall deficits exceeding 4.5 inches
so far.
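
As a hedged illustration of the word basis versus line basis comparison in the example above, the following Python sketch runs the same comparison twice, once over words and once over lines. It assumes the standard difflib module, and the helper name changed_units is hypothetical. It is a sketch under those assumptions and not the utility described in this background.

```python
from difflib import SequenceMatcher

def changed_units(left_units, right_units):
    """Return Right File units (words or lines) with no exact Left File counterpart."""
    matcher = SequenceMatcher(None, left_units, right_units, autojunk=False)
    changed = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            changed.extend(right_units[j1:j2])
    return changed

LEFT = """La Nina continues in the Pacific
Ocean, meaning cooler than average
sea surface temperatures along the
equator north of South America.
Typically this means a warmer and
drier summer for the Midwest. The
Summer of '99 has been very hot,
with 32 days recording highs of 90
degrees or above, and very dry, with
rainfall deficits exceeding 4.5 feet
so far."""

RIGHT = """La Nina continues in the Pacific
Ocean, meaning cooler than average
sea surface temperatures along the
equator west of South America.
Typically this means a warmer and
drier summer for the Midwest. The
Summer of 1999 has been very hot,
with 32 days recording highs of 90
degrees or above, and very dry, with
rainfall deficits exceeding 4.5 inches
so far."""

# Word basis: only the isolated changed words are reported.
word_changes = changed_units(LEFT.split(), RIGHT.split())
# Line basis: each changed word is reported together with its surrounding line.
line_changes = changed_units(LEFT.splitlines(), RIGHT.splitlines())
```

The word basis result carries no context, while the line basis result shows each change inside a readable phrase.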
The World Wide Web (Web) is an international network of computers containing a vast
amount of information. The hypertext markup language (HTML) is the lingua franca for
publishing documents on the Web. The problem is that the typical differencing operations as
described above do not work well for HTML documents. In particular, unlike pure text
documents, or documents created using a word processor, carriage returns in HTML documents
are not significant. In more detail, the width of lines displayed by a viewer will be determined by
the width of the viewer window, not where carriage returns are entered in the HTML file.
Therefore, a typical differencing operation that uses lines for a unit of comparison does not work
successfully when comparing HTML files since the operation may unnecessarily identify
differences which are insignificant. In addition, the HTML language treats contiguous sequences
of white space characters as being equivalent to a single space character, but a typical
differencing operation will not take this into account.

Due to the peculiar rules of the HTML language described above, the following are
equivalent representations of the same paragraph in HTML document sources:

Example 1a - HTML Paragraph 

<P> La Nina continues in the Pacific Ocean, meaning cooler than average sea surface
temperatures along the equator west of South America. Typically this means a warmer and drier
summer for the Midwest.
</P>
Example 1b - Equivalent Variation of HTML Paragraph

<P> La Nina continues in the Pacific
Ocean,

meaning cooler than

average sea surface temperatures along

the equator west of South America. Typically this means a warmer and drier summer for
the Midwest.
</P>

If a typical differencing operation is carried out on the two above paragraphs (which are
considered to be identical in the HTML language), the differencing operation would find
multiple differences since each line is compared character by character. It is thus apparent that
applying a typical differencing operation to these HTML formatted paragraphs would be
ineffective, as it would identify every line as changed instead of recognizing these
representations as equivalent. Thus, it is desirable to provide a system and method for analyzing
an HTML document for changes and for displaying the changed areas with the original HTML
formatting intact and it is to this end that the present invention is directed.
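
The problem described above can be reproduced with a short Python sketch. It assumes the standard difflib module in place of the UNIX utility, and the names differing_lines and collapse_whitespace are hypothetical. The sketch shows that a plain line comparison reports differences between the two equivalent paragraphs, while collapsing white space, as an HTML viewer would, makes them identical.

```python
import difflib

EXAMPLE_1A = """<P> La Nina continues in the Pacific Ocean, meaning cooler than average sea surface
temperatures along the equator west of South America. Typically this means a warmer and drier
summer for the Midwest.
</P>"""

EXAMPLE_1B = """<P> La Nina continues in the Pacific
Ocean,

meaning cooler than

average sea surface temperatures along

the equator west of South America. Typically this means a warmer and drier summer for
the Midwest.
</P>"""

def differing_lines(left_text, right_text):
    """Lines that a plain line-by-line comparison reports as removed or added."""
    diff = difflib.ndiff(left_text.splitlines(), right_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]

def collapse_whitespace(text):
    """Treat every run of white space, carriage returns included, as one space."""
    return " ".join(text.split())

# The line comparison flags differences even though both paragraphs render the same.
spurious = differing_lines(EXAMPLE_1A, EXAMPLE_1B)
# After collapsing white space the two sources are character-for-character equal.
equivalent = collapse_whitespace(EXAMPLE_1A) == collapse_whitespace(EXAMPLE_1B)
```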

Summary of the Invention

The World Wide Web (Web) is an international network of computers containing a vast
amount of information. HTML is the lingua franca for publishing documents on the Web. This
invention describes a computer-based method for analyzing two versions of an HTML document,
which identifies new or changed areas of the document while preserving the original textual
formatting.
Using the system and method in accordance with the invention, two versions of an
HTML document are to be analyzed. The original version will be referred to as the Left File,
while the updated version will be referred to as the Right File. Possible modifications of the Left
File to produce the Right File might include the deletion of text, hypertext links, or embedded
images; the modification of text, hypertext links, or embedded images; the insertion of text,
hypertext links, or embedded images; or any combination thereof. These document elements are
usually the most interesting elements for users to monitor for changes, but any document element
can be monitored for changes with this method, while preserving visual formatting in the vicinity
of the change or changes. Examples of visual formatting include font type, font size, and use of
bold or italics.

In more detail, an HTML document may be scanned and the information organized into
groupings of HTML tags and text. The system includes a set of rules for determining which
HTML tags are permitted within a group, and which HTML tags mark the start of a new group.
The tags that mark the start of a new group are usually those that break the flow of text when an
HTML page is rendered. As a result, the text that constitutes a paragraph, embedded hypertext
links, and any associated HTML character-formatting elements are contained within a single
group. A modified version of the same HTML document is similarly processed. Once the
processing is complete, the two HTML documents may be compared group by group in order to
detect differences. Any group that does not match the associated group in the original is
considered to be a modified group. The modified groups can then be inserted as sections into a
new HTML document, and these sections appear to have nearly all of the original formatting
intact. Thus, the modified sections may appear as clipped sections from the original HTML
document and are useful for depicting regions of interest. In addition to providing a new
document with the clipped sections, an HTML page with the changes highlighted may
be displayed to the user.

Brief Description of the Drawings

Figure 1 is a diagram illustrating an example of a computer-based system that may be
used to execute the HTML normalization method in accordance with the invention;

Figure 2 is a diagram illustrating a computer-implemented HTML normalization and
comparison system in accordance with the invention;

Figure 3 is a flowchart illustrating a method for HTML normalization in accordance with
the invention;

Figure 4 illustrates an example of a left file in accordance with the invention; 
Figure 5 illustrates an example of a right file in accordance with the invention; 
Figure 6 illustrates an example of a typical HTML file; 

Figure 7 illustrates the HTML file of Figure 6 after normalization in accordance with the 
invention; and 

Figure 8 illustrates an example of the comparison results file being displayed to the user. 
Detailed Description of a Preferred Embodiment

The invention is particularly applicable to a personal computer based system and method
for normalizing and comparing HTML documents and it is in this context that the invention will
be described. It will be appreciated, however, that the system and method in accordance with the
invention has greater utility, such that it may be implemented using other types of computer
systems, such as a client/server type system or any other computer-based system and may be
used with other formatted files.

Figure 1 is a diagram illustrating an example of a computer-based system 10 that may be
used to execute the HTML normalization method in accordance with the invention. In this
example, a typical personal computer is shown although the system and method in accordance
with the invention may also be implemented on other different types of computer systems, such
as client/server systems, local area networks and the like. The computer 10 may include a
display unit 12, a main processing unit 14 and one or more input/output devices 16. In this
example, the one or more input/output devices may include a keyboard 18 and a mouse 20. In
accordance with the invention, the input/output devices may also include, for example, a printer.
The display unit 12 may be any typical display device, such as a cathode ray tube, a liquid
crystal display or the like.

The main processing unit 14 may further include a central processing unit (CPU) 22, a
memory 24 and a persistent storage device 26 that are interconnected together. The CPU 22 may
control the operation of the computer and may execute one or more software applications, such
as the HTML normalizer and comparer in accordance with the invention. The software
applications may be stored permanently in the persistent storage device 26 that stores the
software applications even when the power is off and then loaded into the memory 24 when the
CPU is going to execute the particular software application. The persistent storage device 26
may be a hard disk drive, an optical drive, a tape drive or the like. The memory may be a
random access memory (RAM), a read only memory (ROM) or the like. In operation, a
normalization and comparing software application may be stored in the persistent storage device
and, based on user input, loaded into the memory to be executed by the CPU. The normalizer
and comparer system in accordance with the invention may normalize the HTML documents, as
described below, into one or more blocks of information and then compare the blocks of
information to each other in order to accurately compare the HTML documents and maintain the
formatting of the HTML documents during the comparison. Now, more details of the
normalization and comparison system in accordance with the invention will be described.

Figure 2 is a diagram illustrating a computer-implemented HTML normalization and
comparison system 30 in accordance with the invention. Although a software application
implemented system and method in accordance with the invention is described herein, the system
and method may also be implemented in hardware. The system 30 may include one or more
software application modules that may be executed by the CPU (See Figure 1) in order to
perform the functions of the system in accordance with the invention. The system 30 may
include an HTML normalizer module 32, a rules database 34 and a comparer 36. The normalizer
module 32 may convert a first HTML document (HTML #1) and a second HTML document
(HTML #2) into a normalized first and second document (RIGHT and LEFT) based on one or
more normalization rules that may be stored in the rules database 34. The normalization of the
two HTML documents permits those documents to be compared by a typical line comparison
module 36 and then to display the results of the comparison while maintaining the formatting on
the HTML documents. In general, the normalization may involve the conversion of the HTML
document into one or more blocks of information wherein each block of information may be
treated as a single line for purposes of the comparison. Thus, the normalization permits a typical
line based comparison module to be used to accurately compare two HTML documents. More
details of the normalization and the normalization rules in the rules database will now be
described.

Figure 3 is a flowchart illustrating a method 40 for HTML document normalization in
accordance with the invention so that two HTML documents may be compared to each other
using typical comparison systems while maintaining the formatting of the HTML documents. In
step 42, the entire HTML document is scanned and any HTML head element is removed from
the document. In step 44, the HTML document is scanned again and any references to scripts in
the HTML document are removed. In step 46, the HTML document is scanned again and any
intradocument links are removed. In step 48, the HTML document is scanned again and any
relative URLs in the HTML document are converted into absolute URLs. These rules in steps 42
- 48 provide special handling of HTML elements which are not valid when removed from the
context of the original page.
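
Steps 42 through 48 may be sketched as follows. This is a minimal illustration only: it assumes regular expressions are adequate for simple, well-formed markup, it assumes an intradocument link is an anchor whose href begins with "#" and that its link text is kept, and the function name preprocess and the base_url parameter are hypothetical. A production system would more likely rely on an HTML parser.

```python
import re
from urllib.parse import urljoin

def preprocess(source, base_url):
    """Illustrative version of steps 42-48; regex-based and not a full HTML parser."""
    flags = re.IGNORECASE | re.DOTALL
    # Step 42: remove the head element.
    source = re.sub(r'<head\b.*?</head\s*>', '', source, flags=flags)
    # Step 44: remove script references.
    source = re.sub(r'<script\b.*?</script\s*>', '', source, flags=flags)
    # Step 46: remove intradocument links (assumed: href starts with "#"), keeping the text.
    source = re.sub(
        r'<a\b[^>]*\bhref\s*=\s*(["\'])#[^"\']*\1[^>]*>(.*?)</a\s*>',
        r'\2', source, flags=flags)

    # Step 48: convert relative URLs in href and src attributes into absolute URLs.
    def absolutize(match):
        prefix, quote, url = match.group(1), match.group(2), match.group(3)
        return f'{prefix}{quote}{urljoin(base_url, url)}{quote}'

    source = re.sub(r'(\b(?:href|src)\s*=\s*)(["\'])(.*?)\2', absolutize, source, flags=flags)
    return source
```

The intradocument links are removed before the URL conversion so that a fragment-only reference is not turned into an absolute URL pointing back at the original page.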

Now, the entire HTML document is scanned again on a character by character basis to
complete the normalization process. In particular, in step 50, the next character in the document
is retrieved. Next, in step 52, the method determines if a preformatted text character sequence
(/PRE) has been located by scanning several characters. If the preformatted text tag has not been
located, then character by character processing in step 54 occurs. The character by character
processing will be described below in more detail. If a preformatted text tag has been located,
then the method skips step 54 so that none of the character by character processing is carried out
on the preformatted text. With the test in step 52, once an end tag for the preformatted text is
located, the character by character processing may be resumed. Next, in step 56, the method
determines if there are more characters to analyze and loops back to step 50 to get the next
character or the normalization process is completed.
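
The loop of steps 50 through 56 may be sketched in Python as shown below. The sketch assumes that the preformatted region is delimited by PRE start and end tags, and the names scan_document and process_char are hypothetical. The per-character rule is passed in as a function so that the loop itself stays independent of the rules applied in step 54.

```python
def scan_document(source, process_char):
    """Apply process_char to each character outside PRE; copy preformatted text as-is."""
    out = []
    in_pre = False
    i = 0
    while i < len(source):                      # step 56: more characters to analyze?
        lookahead = source[i:i + 5].lower()     # step 52: scan several characters ahead
        if not in_pre and lookahead.startswith("<pre"):
            in_pre = True                       # preformatted text begins
        elif in_pre and lookahead == "</pre":
            in_pre = False                      # end tag located, processing resumes
        ch = source[i]                          # step 50: retrieve the next character
        out.append(ch if in_pre else process_char(ch))  # step 54 is skipped inside PRE
        i += 1
    return "".join(out)
```

For example, a process_char function that turns carriage returns into spaces would leave the line breaks inside the preformatted region untouched.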

The character by character processing may occur by applying multiple different rules to
each character. The rules may include removing the carriage returns from the HTML document,
converting multiple white spaces in the document into single white spaces, separating any block
level HTML elements from each other onto separate lines by inserting a carriage return before
the start tag so that each block in the HTML document is treated as a separate line for
comparison purposes, and keeping any text level HTML elements on the same line. It should be
noted that text level HTML elements don't cause paragraph breaks when rendered into a
displayable form in a web browser. Those text level elements that define character styles can
generally be nested as long as they contain other text level elements but not block level elements
since block level elements are placed on separate lines. In accordance with the invention, the
text level elements may include, for example, font style elements, phrase elements, form fields,
A (anchor) elements, IMG elements (e.g., an inline image in an HTML document), APPLET
elements (e.g., Java Applets), FONT elements, BASEFONT elements, BR elements (e.g., line
breaks in the HTML document), and MAP elements (e.g., a client-side image map in the HTML
document). In removing the white spaces, the method may encounter a first white space and
store it and then throw away all subsequent white spaces until another character is encountered so
that the multiple contiguous white space characters are converted into a single white space
character.
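
The character processing rules described above may be approximated by the following Python sketch. It is not a character-by-character implementation: it assumes whole-string operations give the same result for simple input, it omits the handling of preformatted text, and the BLOCK_TAGS list is an assumed, non-exhaustive set since the block level elements are not enumerated here. The function name normalize is hypothetical.

```python
import re

# Assumed, non-exhaustive list of block level elements (not taken from the source text).
BLOCK_TAGS = ("p", "div", "h1", "h2", "h3", "h4", "h5", "h6",
              "ul", "ol", "li", "table", "tr", "blockquote", "hr")

# A block level start tag, together with any white space immediately before it.
BLOCK_START = re.compile(r"\s*(<(?:%s)\b)" % "|".join(BLOCK_TAGS), re.IGNORECASE)

def normalize(source):
    """Put each block level element on its own line; text level tags stay inline."""
    # Carriage returns and every other run of white space become a single space.
    text = " ".join(source.split())
    # A line break is inserted before each block level start tag.
    text = BLOCK_START.sub(r"\n\1", text)
    return text.strip()
```

With this sketch, two sources that differ only in carriage returns and white space runs yield the same normalized text, and each paragraph, including its inline formatting tags, occupies exactly one line.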

The normalization process has now been completed so that the two normalized HTML
documents may be compared by a typical line comparison operation while still maintaining the
formatting of the HTML document. In summary, the invention establishes a method for
generating a normalized form for an HTML document so that equivalent representations, once
normalized, will appear identical when analyzed via typical line differencing. Another result is
that normalization, when applied as described, will organize block elements, which do not nest
additional block elements, each on separate lines. This is important because it keeps hypertext
links and presentation elements, those that produce visual effects, together with textual content.
Thus, when line differencing is applied, the detected changes are at the block level. In addition,
these block elements will be properly formed HTML, with textual formatting automatically
included along with any associated text.

As described above, there are additional rules that make reference to other areas within
the HTML document, such as removing intradocument links and script references and converting
relative URLs. These rules are not strictly necessary to facilitate the comparison process, but do
avoid errors in the rendering stage when the changed blocks of HTML, inserted into the body of
a new HTML document, are rendered in the browser. The head element is also removed since
this area of the HTML source does not contain displayable information, and so it isn't usually
useful to report changes occurring here.

Now, an example of two HTML documents being normalized in accordance with the
invention will be described. Referring back to Examples 1a and 1b above, if the rules of
normalization in accordance with the invention are applied to Examples 1a and 1b, the two
representations would appear identical. In particular, for this example, all characters would be
organized on a single line, including the <P> and </P> tags.

In accordance with the invention, the changed areas that are discovered through the
differencing process appear as clipped regions from the original revised HTML document (Right
File) when rendered in the browser. Instead of the clipped regions, a new HTML page with
all of the original content plus the changes highlighted may also be displayed for the user. Now,
an example of the system and method in accordance with the invention will be described.

Figure 4 illustrates an example of a left file 60 and Figure 5 illustrates an example of a
right file 62 wherein the two files are HTML pages that display information to the user. For
purposes of a simple example, there is additional text in the second paragraph of the right file
shown in Figure 5, beginning with the sentence formatted in bold. The second paragraph is the
changed information that will be automatically identified by the system and method, along with
its embedded formatting, in accordance with the invention. The system and method can
obviously also be used to compare more complex HTML files.
Figure 6 illustrates an example of a typical HTML file 64 that represents the left file
shown in Figure 4 while Figure 7 illustrates a normalized HTML file 66 that corresponds to the
HTML file shown in Figure 6. As described above, various elements of the HTML file are
removed, not to facilitate the comparison process, but to avoid errors in the rendering stage when
the changed blocks of HTML are inserted into the body of a new HTML document for display in
the browser. As also explained, each paragraph shown in Figures 4 and 5 is inside a block-level
HTML element that is deliberately arranged on a single line. Thus, although various elements of
the HTML page are removed, portions of the HTML are arranged on a single line so that the
line-by-line comparison method may be used to find differences in the HTML documents.

Figure 8 illustrates an example of the comparison results file 80 being displayed to the
user. As shown, the file may be an HTML page that is displayed to the user, and the
formatting of the changed portion (part of which has a bold appearance) and the surrounding
portions is maintained so that the user can see the new or changed information as it was
originally intended to appear.
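
The comparison and the generation of a results file may be sketched as follows, assuming both documents have already been normalized so that each block occupies one line. The sketch uses the standard difflib module as the line comparison; the function names changed_blocks and result_page, the page title, and the HR separator between clipped sections are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

def changed_blocks(left_normalized, right_normalized):
    """Return Right File blocks (one per line) that are new or modified."""
    left_lines = left_normalized.splitlines()
    right_lines = right_normalized.splitlines()
    matcher = SequenceMatcher(None, left_lines, right_lines, autojunk=False)
    blocks = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "insert"):
            blocks.extend(right_lines[j1:j2])
    return blocks

def result_page(blocks, title="Changed sections"):
    """Insert the changed blocks as sections into the body of a new HTML document."""
    body = "\n<hr>\n".join(blocks)  # separator between sections is an assumption
    return f"<html><head><title>{title}</title></head>\n<body>\n{body}\n</body></html>"
```

Because each changed block is copied whole, any inline formatting inside it, such as bold text, is carried into the results page unchanged.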

While the foregoing has been with reference to a particular embodiment of the invention, 
it will be appreciated by those skilled in the art that changes in this embodiment may be made 
without departing from the principles and spirit of the invention, the scope of which is defined by
the appended claims. 
CLAIMS:

1. A method for comparing a left formatted file to a right formatted file, comprising:

detecting groups of characters in the left and right files;

comparing a group in the right file to a corresponding group in the left file to identify a
modified group wherein some portion of the group is different between the left file and the right
file; and

generating a comparison result file containing the modified groups as sections of the
comparison result file to maintain the formatting of the modified groups when placed in the
comparison result file.

2. The method of Claim 1, wherein detecting the groups in the files further
comprises detecting and distinguishing tags in the files to determine the groups in the files.

3. The method of Claim 2, wherein the files are HTML files and the tags are HTML
tags.

4. The method of Claim 1 further comprising displaying the comparison result file to
a user so that the user views the changed portions of the right file with the formatting intact.

5. The method of Claim 1, wherein detecting the groups further comprises
normalizing the right file and left file based on one or more rules in a rules database to permit
line-by-line comparison of the right and left file despite the formatting in the files.
6. The method of Claim 5, wherein the normalization further comprises one or more
rules for handling special elements in the files that would inhibit the line-by-line comparison of
the file.

7. The method of Claim 6, wherein handling the special elements further comprises
one or more of removing header tags from the files, removing script references from the files,
removing intradocument links from the files, and converting relative URLs into absolute URLs
in the file.

8. The method of Claim 5, wherein the comparison further comprises comparing the
right file to the left file on a line-by-line basis wherein block level HTML elements in each file
are treated as separate lines during the comparison.

9. The method of Claim 8, wherein the normalization further comprises processing
each character of the right and left files.

10. The method of Claim 9, wherein the character processing further comprises
detecting a preformatting start tag when scanning the document and skipping the pre-formatted
text contained between the start tag and a preformatting end tag.

11. The method of Claim 10, wherein the character processing further comprises one
or more of removing carriage returns, converting multiple white spaces into a single white space,
separating block level HTML elements into separate lines by inserting carriage returns before a
block level start tag, and keeping text level tags on same line.
12. A system for comparing a left formatted file to a right formatted file, comprising:

means for detecting groups of characters in the left and right files;

means for comparing a group in the right file to a corresponding group in the left file to
identify a modified group wherein some portion of the group is different between the left file and
the right file; and

means for generating a comparison result file containing the modified group as sections
of the comparison result file to maintain the formatting of the modified groups when placed in
the comparison result file.

13. The system of Claim 12, wherein the detecting means further comprises means for
detecting and distinguishing tags in the files to determine the groups in the files.

14. The system of Claim 13, wherein the files are HTML files and the tags are HTML
tags.

15. The system of Claim 12 further comprising means for displaying the comparison
result file to a user so that the user views the changed portions of the right file with the
formatting intact.

16. The system of Claim 12, wherein the detecting means further comprises a
normalizer for processing the right file and left file based on one or more rules in a rules database
to permit line-by-line comparison of the right and left file despite the formatting in the files.
17. The system of Claim 16, wherein the normalizer further comprises one or more
rules for handling special elements in the files that would inhibit the line-by-line comparison of
the file.

18. The system of Claim 17, wherein the normalizer further comprises one or more of
removing header tags from the files, removing script references from the files, removing
intradocument links from the files, and converting relative URLs into absolute URLs in the file.

19. The system of Claim 16, wherein the comparing means further comprises means
for comparing the right file to the left file on a line-by-line basis wherein each block in each file
is treated as a line during the comparison.

20. The system of Claim 16, wherein the normalizer further comprises a character
processor that processes each character of the right and left files.

21. The system of Claim 20, wherein the character processor further comprises means
for detecting a preformatting start tag when scanning the document and means for skipping the
pre-formatted text contained between the start tag and a preformatting end tag.

22. The system of Claim 21, wherein the character processor further comprises one or
more of means for removing carriage returns, means for converting multiple white spaces into a
single white space, means for separating block level HTML elements into separate lines by
inserting carriage returns before a start tag, and means for keeping text level tags on same line.
ABSTRACT OF THE DISCLOSURE

The invention is a computer-based method for analyzing two versions of an HTML 
document that identifies new or changed areas of the document while preserving the original 
textual formatting, including embedded graphics. An HTML document is scanned and the 
information organized into groupings of HTML tags and text. A set of rules determines which 
HTML tags are permitted within a group, and which mark the start of a new group. Tags that 
mark the start of a new group are usually those that break the flow of text when an HTML page 
is rendered. As a result, the text that constitutes a paragraph, embedded hypertext links, and any 
associated HTML character-formatting elements are contained within a single group. A 
modified version of the same HTML document is similarly processed. At this point, the two can 
be compared group by group in order to detect differences. Any group that does not match the 
associated group in the original is considered to be a modified group. The modified groups can 
then be inserted as sections into a new HTML document, and these sections appear to have 
nearly all of the original formatting intact. Thus, they appear as clipped sections from the original
HTML document, and are useful for depicting regions of interest.
Intelliseek's mission is to add intelligence to the Internet.
The Company is a leading provider of intelligent search,
tracking and personalization infrastructure services to
e-businesses.

Software infrastructure services add intelligence to portals.
Intelligent portals understand their target audience and their 
specific requirements on an ongoing basis. 

Intelliseek services include: integrated search services 
(news, database searching, shopping, metasearching, and 
vertical searching); personalization services (personalized 
content; adaptive user profiling) and tracking services 
(page & site tracking, topic tracking and customized 
tracking). 
Intelliseek's mission is to add intelligence to the Internet.
The Company is a leading provider of intelligent search, 
tracking and personalization infrastructure services to 
e-businesses. 

Software infrastructure services add intelligence to portals. 
Intelligent portals understand their target audience and their 
specific requirements on an ongoing basis. Intelliseek
offers a suite of services that can be licensed by
partners. These services enable companies to build or
enhance a Website with personalized, highly relevant
content and high-value Internet services to cost effectively
increase customer access and retention.

Intelliseek services include: integrated search services
(news, database searching, shopping, metasearching, and 
vertical searching); personalization services (personalized 
content; adaptive user profiling) and tracking services 
(page & site tracking, topic tracking and customized 
tracking). 
Software infrastructure services add intelligence to portals. Intelligent portals
understand their target audience and their specific requirements on an ongoing basis.
Intelliseek offers a suite of services that can be licensed by partners. These
services enable companies to build or enhance a Website with personalized, highly
relevant content and high-value Internet services to cost effectively increase customer 
access and retention. 
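
The three excerpts above (original page text, revised page text, and the extracted changed area) can be approximated for plain text with a sentence-level comparison. The following is only a minimal sketch in Python using the standard difflib module. It is not the method claimed in this specification, it ignores HTML formatting entirely, and the helper names split_sentences and added_sentences are hypothetical, introduced purely for illustration.

```python
import difflib
import re


def split_sentences(text):
    """Collapse whitespace, then split after '.', '!' or '?' (a rough heuristic)."""
    normalized = " ".join(text.split())
    return [s for s in re.split(r"(?<=[.!?])\s+", normalized) if s]


def added_sentences(old_text, new_text):
    """Return sentences of new_text that are inserted or replaced relative to old_text."""
    old = split_sentences(old_text)
    new = split_sentences(new_text)
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    added = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return added


# Second paragraph of the original and revised excerpts shown above.
old_text = """Software infrastructure services add intelligence to portals.
Intelligent portals understand their target audience and their
specific requirements on an ongoing basis."""

new_text = """Software infrastructure services add intelligence to portals.
Intelligent portals understand their target audience and their
specific requirements on an ongoing basis. Intelliseek
offers a suite of services that can be licensed by
partners. These services enable companies to build or
enhance a Website with personalized, highly relevant
content and high-value Internet services to cost effectively
increase customer access and retention."""

result = added_sentences(old_text, new_text)
for sentence in result:
    print(sentence)
```

Run on the excerpts, the sketch prints only the sentences that appear in the revised text and not in the original. Unlike the extracted area shown above, it does not carry along the surrounding unchanged sentences or any of the original formatting.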
DECLARATION AND POWER OF ATTORNEY 



DECLARATION: 



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name. 

I believe I am the original, first and sole inventor (if only one name is listed below) or an 
original, first and joint inventor (if plural names are listed below) of the subject matter which is 
claimed and for which a patent is sought on the invention entitled: 

SYSTEM AND METHOD FOR ANALYZING AN HTML DOCUMENT FOR CHANGES 
SUCH THAT THE CHANGED AREAS CAN BE DISPLAYED WITH THE ORIGINAL 
FORMATTING INTACT 

the specification of which (check only one item below): 

X is attached hereto. 

was filed as United States Application 

Serial No. on 

and was amended on (if applicable). 

was filed as PCT international application 

Number on 

and was amended under PCT Article 19 

on (if applicable). 

I hereby state that I have reviewed and understand the contents of the above-identified 
specification, including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the examination of 
this application in accordance with Title 37, Code of Federal Regulations, §1.56(a). 

I hereby claim foreign priority benefits under Title 35, United States Code, §119 of any 
foreign application(s) for patent or inventor's certificate or of any PCT international 
application(s) designating at least one country other than the United States of America listed 
below and have also identified below any foreign application(s) for patent or inventor's 
certificate or any PCT international application(s) designating at least one country other than the 
United States of America filed by me on the same subject matter having a filing date before that 
of the application(s) on which priority is claimed: 

PRIOR FOREIGN/PCT APPLICATION(S) AND ANY PRIORITY CLAIMS UNDER 35 U.S.C. 119: 

Country (If PCT, indicate PCT)    Application Number    Date Filed    Priority Claimed (Yes/No) 

I hereby claim the benefit under Title 35, United States Code, §120 of any United States 
application(s) or PCT international application(s) designating the United States of America that 
is/are listed below and, insofar as the subject matter of each of the claims of this application is 
not disclosed in that/those prior application(s) in the manner provided by the first paragraph of 
Title 35, United States Code, §112, I acknowledge the duty to disclose material information as 
defined in Title 37, Code of Federal Regulations, § 1.56(a) which occurred between the filing 
date of the prior application(s) and the national or PCT international filing date of this 
application: 



PRIOR U.S. APPLICATIONS OR PCT INTERNATIONAL APPLICATIONS DESIGNATING THE U.S. 
FOR BENEFIT UNDER 35 U.S.C. 120: 

U.S. APPLICATIONS 

U.S. APPLICATION NUMBER    U.S. FILING DATE    STATUS (check one): PATENTED / PENDING / ABANDONED 
60/154,966                                     X 

PCT APPLICATIONS DESIGNATING THE U.S. 

PCT APPLICATION NO.    PCT FILING DATE    U.S. SERIAL NUMBERS ASSIGNED (if any) 

POWER OF ATTORNEY: 

As a named inventor, I hereby appoint the following attorney(s) and/or agent(s) with full power 
of substitution to act exclusively to prosecute this application and transact all business in the 
Patent and Trademark Office connected therewith: 

Barry N. Young (Reg. No. 27,744); Timothy W. Lohse (Reg. No. 35,255); Stephen E. Reiter (Reg. No. 31,192); 
Steven R. Sprinkle (Reg. No. 40,825); William N. Hulsey III (Reg. No. 33,402); Terrance A. Meador (Reg. No. 
30,298); Ramsey R. Stewart (Reg. No. 38,322); June M. Learn (Reg. No. 31,238); John Oskorep (Reg. No. 
41,234); Timothy N. Ellis (Reg. No. 41,734); David R. Stevens (Reg. No. 38,626); William G. Goldman (Reg. No. 
42,590); Derek J. Westberg (Reg. No. 40,872); Sheila Kirschenbaum (Reg. No. 44,835); Travis L. Dodd (Reg. No. 
42,491); Charles D. Gavrilovich, Jr. (Reg. No. 41,031); Gerald W. Maliszewski (Reg. No. 38,054); Hayward A. 
Verdun (Reg. No. 43,223); Armando Pastrana, Jr. (Reg. No. 44,997); Richard M. Goldman (Reg. No. 25,585) 

All correspondence should be addressed to: 

Timothy W. Lohse 

GRAY CARY WARE & FREIDENRICH 
Patent Department - Hillview 
3340 Hillview Avenue 
Palo Alto, CA 94304 

All telephone calls should be directed to Timothy W. Lohse, telephone number (650) 
320-7400. 



Inventor's Full Name: CONNAUGHTON, Chris 

Inventor's Signature: 

Date: 

Residence (City, State and/or country): West Chester, Ohio 

Citizenship: US 

Post Office Address: 8234 Autumn Lane, West Chester, Ohio 45069 



